SARS coronavirus main protease (SARS-CoV Mpro) is essential for the replication of the virus and regarded
as a major antiviral drug target. The enzyme is a cysteine protease, with a catalytic dyad (Cys-145/His-41)
in the active site. Aldehyde inhibitors can bind reversibly to the active-site sulfhydryl of SARS-CoV Mpro.
Previous studies using peptidic substrates and inhibitors showed that the substrate specificity of SARS-CoV
Mpro requires glutamine in the P1 position and a large hydrophobic residue in the P2 position. We
determined four crystal structures of SARS-CoV Mpro in complex with pentapeptide aldehydes (Ac-ESTLQ-H,
Ac-NSFSQ-H, Ac-DSFDQ-H, and Ac-NSTSQ-H). Kinetic data showed that all of these aldehydes exhibit
inhibitory activity towards SARS-CoV Mpro, with Ki values in the µM range. Surprisingly, the X-ray
structures revealed that the hydrophobic S2 pocket of the enzyme can accommodate serine and even
aspartic-acid side-chains in the P2 positions of the inhibitors. Consequently, we reassessed the substrate
specificity of the enzyme by testing the cleavage of 20 different tetradecapeptide substrates with varying
amino-acid residues in the P2 position. The cleavage efficiency for the substrate with serine in the P2
position was 160-times lower than that for the original substrate (P2 = Leu); furthermore, the substrate
with aspartic acid in the P2 position was not cleaved at all. We also determined a crystal structure of
SARS-CoV Mpro in complex with aldehyde Cm-FF-H, which has its P1-phenylalanine residue bound to
the relatively hydrophilic S1 pocket of the enzyme and yet exhibits a high inhibitory activity against
SARS-CoV Mpro, with Ki = 2.24 ± 0.58 µM. These results show that the stringent substrate specificity of
the SARS-CoV Mpro with respect to the P1 and P2 positions can be overruled by the highly electrophilic
character of the aldehyde warhead, thereby constituting a deviation from the dogma that peptidic inhibitors
need to correspond to the observed cleavage specificity of the target protease.

© 2011 Elsevier B.V. All rights reserved. 
1. Introduction


SARS coronavirus main protease (SARS-CoV Mpro) is essential
for the replication of the virus and regarded as a major antiviral
drug target (Anand et al., 2003; Yang et al., 2003; Tan et al., 
2005; Steuber and Hilgenfeld, 2010). Peptide aldehyde inhibitors 
are widely used as research tools to characterize the substrate 
specificity of serine and cysteine proteases. In complex with the 
crystalline target enzyme, they provide a wealth of information 
on the specificity-defining subsites of the substrate-binding site. 
For this purpose, we have previously described peptide aldehydes 
as inhibitors of the SARS-coronavirus main protease (SARS-CoV
Mpro) (Al-Gharabli et al., 2006). A major advantage of aldehydes
over other peptide electrophiles is their reversible binding to the
active-site sulfhydryl of cysteine proteases. We have previously
used this property to screen a small library of non-peptide nucleo- 
philes (mostly amines) for molecules that would efficiently en- 
hance the inhibition of the aldehyde by formation of a stronger 
binding aldehyde-nucleophile ligation product. In a second step 
of this process that we named “Dynamic Ligation Screening” 
(Schmidt et al., 2008), the most efficient compound in this process 
had its amino group replaced by an aldehyde moiety and was re- 
acted with the target enzyme, and the resulting covalent adduct 
was subsequently used to screen the same library of nucleophiles 
for those that would react with the aldehyde. This way, starting 
from a substrate-like peptide aldehyde, we obtained a low-µM
non-peptidic, reversible inhibitor from two fragments which by 
themselves had no or only very low inhibitory activity (Schmidt 
et al., 2008). 

Interestingly, it turned out that some peptide aldehydes with an 
amino-acid sequence deviating from the consensus sequence 
embracing the cleavage sites of polyprotein substrates were sur- 
prisingly efficient as inhibitors. In particular, peptide aldehydes 
with P2 = Asp or Ser inhibited the main protease with a relatively 
low IC50 (Al-Gharabli et al., 2006), although it is generally agreed
that the S2 specificity subsite of the enzyme has a strong prefer- 
ence for large hydrophobic side-chains such as Leu, Phe, or Met 
(Fan et al., 2004, 2005; Lai et al., 2006). We therefore assumed that 
the hydrophilic P2 side-chain of these inhibitors would be oriented 
towards the solvent, rather than occupy the S2 pocket, and that the 
hydrophobic P3 residue would be binding to that pocket 
(Al-Gharabli et al., 2006; Schmidt et al., 2008). This model was 
based on a crystal structure of a peptidyl chloromethyl ketone 
bound to the SARS-CoV main protease, in which this arrangement 
had been observed (Yang et al., 2003). Here we show by X-ray crys- 
tallography of a number of peptide aldehyde complexes with the 
protein that this is not the case. Instead, the hydrophilic Ser or 
Asp residues in the P2 position bind to the hydrophobic S2 subsite. 
In order to understand these binding modes, the atomic interac- 
tions were analyzed in detail. Also, we report the inhibition kinet- 
ics of these compounds and re-evaluate the cleavage specificity of 
the enzyme as far as the P2 position is concerned. In addition, we 
determined a crystal structure of SARS-CoV Mpro in complex with
the aldehyde Cm-FF-H (cinnamoyl-Phe-Phe-H), which has a phenylalanine
residue in the P1 position and exhibits high inhibitory
activity against SARS-CoV Mpro, with Ki = 2.24 ± 0.58 µM. These results
show that the stringent substrate specificity of the SARS-CoV
Mpro with respect to the P1 and P2 positions can be overruled by
the highly electrophilic character of the aldehyde warhead.


2. Materials and methods 
2.1. Synthesis of aldehydes with various P2 residues 


Chemical synthesis of peptide aldehydes Ac-ESTLQ-H, 
Ac-NSFSQ-H, Ac-DSFDQ-H, and Ac-NSTSQ-H was performed 
employing a solid-state method (Al-Gharabli et al., 2006). Briefly, 
protected glutamine aldehyde obtained by racemization-free oxidation
of the corresponding amino alcohol with Dess-Martin periodinane
was immobilized on a threonyl resin as oxazolidine.
Following N-tert-butyl oxycarbonyl (Boc)-protection of the ring 
nitrogen to yield the N-protected oxazolidine linker, peptide syn- 
thesis was performed on the resin. 


2.2. Synthesis of Cm-FF-H (Scheme 1) 


Synthesis of Cm-FF-H was carried out by amidation of
Boc-L-phenylalanine with L-phenylalanine methyl ester followed
by deprotection of the Boc group and acylation of the corresponding
product with cinnamoyl chloride to provide N-cinnamoyl-L-phenylalanyl-L-phenylalanine
methyl ester. Cm-FF-H was then obtained
by reduction of the ester with NaBH4/CaCl2 and alcohol oxidation
with IBX.


2.3. Enzyme kinetics 


The recombinant production and purification of SARS-CoV Mpro
with authentic N and C termini were performed as described previously
(Xue et al., 2007; Verschueren et al., 2008). The substrate
Dabcyl-KTSAVLQ↓SGFRKME-(Edans)-amide (95% purity; Biosyntan
GmbH, Berlin, Germany), which contains a main-protease cleavage
site (indicated by the arrow), was used in the fluorescence
resonance energy transfer (FRET)-based cleavage assay.
The substrate stock was prepared by dissolving the peptide
in DMSO at 1 mM. The dequenching of the Edans fluorescence due to
the cleavage of the substrate as catalyzed by the SARS-CoV Mpro
was monitored at 490 nm with excitation at 340 nm, using a Cary
Eclipse fluorescence spectrophotometer. The experiments were
performed in a buffer consisting of 20 mM Tris-HCl (pH 7.3),
100 mM NaCl, and 1 mM EDTA. The reaction was initiated by adding
different final concentrations of the FRET peptide (10-50 µM)
to a solution containing SARS-CoV Mpro (final concentration
0.5 µM). Kinetic constants (Vmax and Km) were derived by fitting
the data to the Michaelis-Menten equation, V = Vmax × [S]/(Km + [S]).
Then kcat was calculated according to the equation
kcat = Vmax/[E]. In order to estimate the population of catalytically
active Mpro dimers, the respective monomer and dimer concentrations
were calculated according to Graziano et al. (2006). The
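dimerization of the SARS-CoV Mpro follows the scheme

$$\mathrm{M} + \mathrm{M} \rightleftharpoons \mathrm{D}$$

The equilibrium dissociation constant, $K_D$, is defined by

$$K_D = \frac{[\mathrm{M}]^2}{[\mathrm{D}]},$$

where [M] and [D] are the molar concentrations of the monomer
and dimer, respectively. The total protein concentration $[\mathrm{M}]_t$
expressed in terms of molar monomer equivalents is

$$[\mathrm{M}]_t = [\mathrm{M}] + 2[\mathrm{D}].$$

Since

$$[\mathrm{D}] = \frac{[\mathrm{M}]_t - [\mathrm{M}]}{2},$$

substituting the above equation into the expression for $K_D$ yields

$$K_D = \frac{2[\mathrm{M}]^2}{[\mathrm{M}]_t - [\mathrm{M}]}.$$

Solving the above quadratic equation for [M] gives

$$[\mathrm{M}] = \frac{-K_D + \sqrt{K_D^2 + 8[\mathrm{M}]_t K_D}}{4}.$$

It then follows that

$$[\mathrm{D}] = \frac{K_D + 4[\mathrm{M}]_t - \sqrt{K_D^2 + 8[\mathrm{M}]_t K_D}}{8}.$$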
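
As a numerical illustration of the monomer-dimer expressions above, the following minimal sketch evaluates [M] and [D] for a given total concentration. The function name and the example inputs are assumptions chosen for illustration: 0.5 µM is the enzyme concentration quoted for the FRET assay, and 0.25 µM is used here only as an example value of the dissociation constant.

```python
import math

def monomer_dimer(total_monomer, k_d):
    """Return ([M], [D]) for the equilibrium M + M <=> D.

    total_monomer: total protein concentration in monomer equivalents.
    k_d: dimer dissociation constant, [M]^2 / [D].
    Both arguments must use the same concentration unit (e.g. uM).
    """
    root = math.sqrt(k_d ** 2 + 8.0 * total_monomer * k_d)
    monomer = (-k_d + root) / 4.0                      # positive root of the quadratic
    dimer = (k_d + 4.0 * total_monomer - root) / 8.0   # equals (total - monomer) / 2
    return monomer, dimer

# Illustrative inputs (assumed for this example): total = 0.5 uM, K_D = 0.25 uM
m, d = monomer_dimer(0.5, 0.25)
print(f"[M] = {m:.3f} uM, [D] = {d:.3f} uM")
```

With these inputs roughly 0.15 µM of dimer is present, so only about 60% of the monomer equivalents would be in the dimeric state under these assumed conditions.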


2.4. Aldehyde inhibition assay


Aldehyde stock solutions were prepared by dissolving the compounds
in DMSO at 1 mM. All aldehydes form covalent bonds between
the aldehyde (-CHO) group of the inhibitor and the
sulfhydryl (-SH) group of Cys145 of SARS-CoV Mpro. The binding
Scheme 1. Synthesis of Cm-FF-H. Reagents and conditions: (a) EDCI, HOBt, Boc-Phe-OH, DMF, 0-25 °C, 18 h, 78%; (b) (i) TFA, CHCl, 0-25 °C, 2 h; (ii) cinnamoyl chloride,
N-methylmorpholine, THF, DMF, 0-25 °C, 12 h, 50%; (c) NaBH4, CaCl2, EtOH, THF (1:1), 0-25 °C, 14 h, 35%; (d) IBX, DMSO, 0-25 °C, 5 h, 47%.


of these inhibitors is reversible (Schmidt et al., 2008), although it is
strong. Therefore, they were treated as reversible tight-binding
inhibitors. For the measurement of the inhibition constant Ki,
SARS-CoV Mpro was incubated with the aldehyde inhibitor in reaction
buffer at room temperature for 30 min. Then the FRET peptide
substrate, Dabcyl-KTSAVLQ↓SGFRKME-Edans, was added and the
reaction velocity was calculated according to substrate cleavage.
Values of the intrinsic ($V_0$) and apparent ($V_i^{app}$, $K_m^{app}$) catalytic
parameters for SARS-CoV Mpro catalyzing the hydrolysis of peptide
substrate were determined in the absence and presence of
aldehyde, respectively. The apparent inhibition constants ($K_i^{app}$)
for aldehyde binding to SARS-CoV Mpro were obtained from the
dependence of $V_i^{app}$ on the inhibitor concentration ([I]) at fixed
substrate concentration ([S]), according to the equation
$V_i^{app} = (V^{app} \times [I])/(K_i^{app} + [I])$ (Ascenzi et al., 1987; Copeland, 2000). Values
of the intrinsic inhibition constant ($K_i$) for aldehyde binding to
SARS-CoV Mpro were calculated according to the equation
$K_i^{app} = K_i \times (1 + [S]/K_m)$ (Ascenzi et al., 1987; Copeland, 2000).
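
The final conversion from the apparent to the intrinsic inhibition constant can be sketched as below. This is a minimal illustration of the relation between the apparent constant, Ki, [S], and Km given above; the function name and the numerical inputs are hypothetical and are not taken from the measurements reported here.

```python
def intrinsic_ki(ki_app, substrate_conc, km):
    """Convert an apparent Ki to the intrinsic Ki.

    Rearranges Ki_app = Ki * (1 + [S]/Km) to Ki = Ki_app / (1 + [S]/Km).
    All three arguments must use the same concentration unit.
    """
    if km <= 0:
        raise ValueError("Km must be positive")
    return ki_app / (1.0 + substrate_conc / km)

# Hypothetical numbers for illustration only (all in uM)
print(intrinsic_ki(ki_app=20.0, substrate_conc=25.0, km=25.0))  # 10.0
```

When the substrate concentration equals Km, the apparent constant is exactly twice the intrinsic one, which is why the substrate concentration has to be fixed and known for the correction.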


2.5. Peptide cleavage assay 


Twenty peptide substrates harboring alternative amino-acid
residues in P2 (=X; SWTSAVXQ↓SGFRKWA) were purchased from
GL Biochemistry Ltd. (Shanghai, China). These peptides correspond
to the N-terminal autocleavage site of the SARS-CoV Mpro, with
the exception of the P7 Ile, which had been replaced by Trp, and
the P6′ Met, which had also been replaced by Trp. All peptide substrate
stocks were dissolved in DMSO at 10 mM. To determine the
kcat/Km for the substrate, 0.2 mM substrate peptide was incubated
with 10 µM SARS-CoV Mpro in 40 mM Tris-HCl buffer, pH 7.3. Aliquots
of the reaction were removed at different times and stopped by the
addition of 1% trichloroacetic acid. Separation of products and substrate
was carried out using reverse-phase (RP) HPLC and a linear
gradient (1-90%) of acetonitrile in 0.1% trifluoroacetic acid. The
absorbance was determined at 280 nm, and peak areas were calculated
by integration. The (kcat/Km)app ratio was determined by plotting
the substrate peak area using the equation ln PA = C − (kcat/Km)app × Ct × t,
where PA is the peak area of the substrate peptide,
Ct is the total concentration of SARS-CoV Mpro, and C is an experimental
constant (Fan et al., 2004, 2005).
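
The pseudo-first-order analysis described above can be sketched as a straight-line fit of ln PA against time, whose slope equals −(kcat/Km)app × Ct. The code below is a minimal illustration using ordinary least squares; the function name, the time points, and the synthetic peak areas (generated from an assumed rate constant of 1000 M⁻¹ s⁻¹) are hypothetical and do not reproduce the measured data.

```python
import math

def kcat_over_km_app(times_s, peak_areas, enzyme_conc_molar):
    """Estimate (kcat/Km)app in M^-1 s^-1 from substrate peak areas.

    Fits ln(PA) = C - (kcat/Km)app * Ct * t by ordinary least squares
    and divides the negative slope by the enzyme concentration Ct.
    """
    y = [math.log(pa) for pa in peak_areas]
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_y = sum(y) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    sxy = sum((t - mean_t) * (yi - mean_y) for t, yi in zip(times_s, y))
    slope = sxy / sxx
    return -slope / enzyme_conc_molar

# Synthetic data (assumed): true rate constant 1000 M^-1 s^-1, Ct = 10 uM
ct = 10e-6
times = [0, 60, 120, 300, 600]
areas = [500.0 * math.exp(-1000.0 * ct * t) for t in times]
print(round(kcat_over_km_app(times, areas, ct)))
```

Because only the slope is used, the absolute scale of the peak areas (the constant C) does not affect the estimate.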


2.6. Crystallization of the complexes 


SARS-CoV Mpro with authentic chain termini was concentrated
to 10 mg/ml and crystallized by vapor diffusion using sitting drops
(Xue et al., 2007). The crystals grew overnight at 20 °C by equilibration
against a reservoir containing 6-8% polyethylene glycol (PEG)
6000, 0.1 M MES (pH 6.0), 3% 2-methyl-2,4-pentanediol (MPD),
and 3% DMSO. All aldehydes were dissolved in 8% PEG 6000, 0.1
M MES (pH 6.0), 3% MPD, and 10% DMSO to a concentration of
10 mM. Crystals of the aldehyde complexes of SARS-CoV Mpro were
obtained either by adding a 4-µl aliquot of aldehyde solution to the
drop and soaking the crystals for 12 h, or by incubating the
enzyme for 2 h at 20 °C with a 7-fold excess of the aldehyde solution
and subsequent cocrystallization at 20 °C against a reservoir
containing 8% PEG 6000, 0.1 M MES (pH 6.0), 3% MPD, and 3%
DMSO. In the latter case, nucleation was initiated by microseeding
using crushed monoclinic (space group C2) crystals of the SARS-CoV
Mpro. Prior to diffraction data collection, the aldehyde complex
crystals obtained from soaking and cocrystallization were transferred
for a few seconds to a cryoprotectant solution containing
the crystallization ingredients and 20% MPD.


2.7. Crystallographic data collection and processing, and structure 
elucidation and refinement 


All diffraction data were collected at 100 K at the Joint EMBL/ 
University of Hamburg/University of Lübeck synchrotron beamline
X13 at DESY (Hamburg, Germany), using a 165-mm MAR CCD
detector (Mar Research, Hamburg, Germany), or at synchrotron
beamline BL14.1 at BESSY (Berlin, Germany), using an MX225
CCD detector (Rayonix, Evanston, IL). Statistics of data collection,
processing, and refinement are summarized in Table 1. Data were 
processed with MOSFLM (Leslie, 1992), and scaled using the SCALA 
program from the CCP4 suite (Collaborative Computational Project, 
1994; Evans, 2006). 

Structure elucidation and refinement were carried out using the 
CCP4 suite of programs (Collaborative Computational Project, 
1994; Potterton et al., 2003). Crystal structures were determined 
by molecular replacement, using the original X-ray structure of 
the SARS-CoV Mpro with authentic chain termini (PDB ID: 2H2Z)
(Xue et al., 2007) as the initial model. REFMAC (Murshudov et al., 
1997) was employed for structure refinement. The computer 
graphics program Coot (Emsley and Cowtan, 2004) was used for 
interpretation of electron density maps and model building. Gap 
volumes between inhibitor and protein in the subsites of the 
enzyme were calculated using UCSF-Chimera (Pettersen et al., 
2004). The molecular graphics package PyMOL (DeLano, 2002) 
was used to generate the figures. 


3. Results and discussion 
3.1. Inhibition of SARS-CoV Mpro by peptide aldehydes


The kinetic parameters of SARS-CoV Mpro with authentic chain
termini were determined using the FRET substrate
Dabcyl-KTSAVLQ↓SGFRKME-Edans amide. The Km value for this substrate
was determined as 24.5 µM and the apparent kcat/Km value was
2359 M⁻¹ s⁻¹. However, the dissociation of the catalytically
active SARS-CoV Mpro dimer into inactive monomers (Fan et al.,
Table 1
Data collection and refinement statistics.

Inhibitor                             Ac-ESTLQ-H    Ac-DSFDQ-H    Ac-NSTSQ-H    Ac-NSFSQ-H    Ac-ESTLQ-H          Cm-FF-H
Data collection statistics
  Wavelength (Å)                      0.8123        0.8123        0.8123        0.8123        0.8123              0.9184
  Method                              Soaking       Soaking       Soaking       Soaking       Cocrystallization   Soaking
  Space group                         C2            C2            C2            C2            P2₁                 C2
  Unit cell dimensions
    a (Å)                             108.91        108.74        107.83        107.87        52.37               107.75
    b (Å)                             81.36         81.86         82.46         82.09         96.79               82.88
    c (Å)                             53.40         53.21         53.32         53.08         68.07               53.50
    β (°)                             104.35        104.26        104.65        104.30        102.49              104.63
  N (a)                               1             1             1             1             2                   1
  Resolution range (Å)                32.21-2.60    32.32-2.40    26.58-2.58    64.56-3.05    27.29-1.89          32.82-1.99
  Number of reflections               13,957        16,009        13,938        8643          50,901              31,388
  Redundancy (b)                      3.8 (3.8)     3.4 (2.8)     3.0 (2.7)     3.4 (3.4)     2.8 (2.6)           3.6 (3.4)
  Completeness (%) (b)                99.9 (100.0)  91.3 (85.7)   96.7 (82.5)   99.9 (100.0)  94.5 (78.3)         99.7 (98.1)
  Rmerge (%) (b)                      10.2 (45.6)   10.4 (25.7)   6.0 (32.0)    8.3 (47.2)    6.7 (33.6)          17.0 (36.6)
  I/σ(I) (b)                          8.7 (2.4)     5.8 (2.3)     10.1 (2.8)    10.2 (2.3)    10.3 (2.7)          4.6 (2.2)
Refinement statistics
  R/Rfree (%)                         18.5/22.5     21.2/23.9     19.8/26.0     19.0/25.5     18.7/23.8           21.7/26.9
  r.m.s. deviation from ideal geometry
    Bonds (Å)                         0.010         0.009         0.017         0.015         0.009               0.018
    Angles (°)                        1.535         1.162         1.820         1.661         1.172               1.689
  Ramachandran plot
    Most favored (%)                  88.0          90.3          83.9          80.6          90.6                89.8
    Allowed (%)                       10.9          8.6           14.2          18.3          8.3                 9.1
    Generously allowed (%)            0.4           0.7           1.1           0.4           0.4                 0.8
    Disallowed (%)                    0.7           0.4           0.7           0.8           0.8                 0.4
PDB ID                                3SNE          3SNB          3SNC          3SNA          3SND                3SN8

(a) Number of protein molecules per asymmetric unit.
(b) Numbers in parentheses are for the outermost resolution shell.


2004) has to be taken into account. A wide range of KD values has
been reported for Mpro dimer dissociation (see Grum-Tokars et al.
(2008) for an overview). For the enzyme with authentic chain termini,
the latter authors reported a KD of 0.25 to 1.0 µM. As the Mpro
dimer tends to be stabilized by the presence of substrate (Cheng
et al., 2010), we used the lower limit of this range for estimation
of the necessary corrections and obtained a kcat/Km value of
7863 M⁻¹ s⁻¹. The Km value determined in our study is of the same
magnitude as other values derived from data not corrected for dissociation,
which are typically in the range from 10 to 50 µM
(Grum-Tokars et al., 2008). This suggests that much higher values
reported previously (e.g., Verschueren et al., 2008) should probably
be reconsidered.
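
One way to read the dissociation correction is sketched below, under an assumption that is not spelled out in the text: the apparent kcat/Km, computed from the total enzyme concentration, is rescaled by the ratio of total monomer equivalents to active dimer. The function names are hypothetical; the inputs (0.5 µM total enzyme, dissociation constant 0.25 µM, apparent value 2359 M⁻¹ s⁻¹) are the figures quoted in this paper.

```python
import math

def dimer_conc(total_monomer, k_d):
    """Dimer concentration for M + M <=> D (same unit for both inputs)."""
    root = math.sqrt(k_d ** 2 + 8.0 * total_monomer * k_d)
    return (k_d + 4.0 * total_monomer - root) / 8.0

def corrected_kcat_km(apparent, total_monomer, k_d):
    """Rescale an apparent kcat/Km from total enzyme to active dimer.

    Assumption for this sketch: only the dimer is catalytically active,
    so the enzyme concentration [E] is replaced by [D].
    """
    return apparent * total_monomer / dimer_conc(total_monomer, k_d)

value = corrected_kcat_km(2359.0, 0.5, 0.25)
print(round(value))  # 7739
```

Under this assumption the unrounded dimer concentration (about 0.152 µM) gives roughly 7.7 × 10³ M⁻¹ s⁻¹; rounding the dimer concentration to 0.15 µM gives a factor of 3.33, which would match the reported 7863 M⁻¹ s⁻¹.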

Initially, four pentapeptide substrate-analogous aldehydes containing
the canonical P1 residue, glutamine, namely Ac-ESTLQ-H,
Ac-NSTSQ-H, Ac-DSFDQ-H, and Ac-NSFSQ-H, were tested for inhibition
of SARS-CoV Mpro. Overall, the four analogues were found
to be reversible tight-binding inhibitors. Harboring the canonical
P2 Leu, aldehyde Ac-ESTLQ-H exhibited inhibition with a relatively
low Ki of 8.27 ± 1.52 µM. Aldehydes Ac-NSTSQ-H, Ac-DSFDQ-H,
and Ac-NSFSQ-H, all with a non-canonical P2 residue, exhibited
moderate inhibition with Ki values of 40.98 ± 2.63, 41.24 ± 2.25,
and 72.73 ± 3.60 µM, respectively. Surprisingly, aldehyde Cm-FF-H, carrying a
cinnamoyl group in the P3 and a Phe residue in the P1 position,
had an even higher inhibitory activity against SARS-CoV Mpro than
the four pentapeptide aldehydes, with a Ki of 2.24 ± 0.58 µM.


3.2. Overall structures of the aldehyde complexes 


The aldehydes Ac-ESTLQ-H, Ac-NSTSQ-H, Ac-DSFDQ-H,
Ac-NSFSQ-H, and Cm-FF-H were separately soaked into crystals
of SARS-CoV Mpro. The crystals were all of space group C2, which
is often observed for SARS-CoV Mpro (Lee et al., 2005; Xue et al.,
2007; Verschueren et al., 2008). These crystals contain one SARS-CoV
Mpro monomer per asymmetric unit and the dimer (which is
the enzymatically active species) is formed through the symmetry
of the crystal. The four pentapeptide aldehydes Ac-ESTLQ-H,
Ac-NSTSQ-H, Ac-DSFDQ-H, and Ac-NSFSQ-H are bound in extended
conformations in the S6-S1 specificity subsites of SARS-CoV Mpro.
Cm-FF-H occupies sites S3-S1. Remarkably, the P1 phenylalanine
side chain of this inhibitor is bound deeply in the S1 pocket, which
is generally considered to be specific for glutamine. 2Fo − Fc
electron density maps of these aldehyde inhibitors are shown in
Fig. 1. In all complexes, continuous electron density between the
aldehyde carbonyl C-atom of the inhibitor and Cys145-Sγ of
SARS-CoV Mpro (Fig. 1) indicates the formation of a thiohemiacetal,
as a result of the nucleophilic attack of the catalytic cysteine onto
the C-terminal aldehyde of the inhibitor. The main-chain conformations
of SARS-CoV Mpro in the five complexes are basically identical,
with overall root mean-square deviations (RMSD) of
0.16-0.36 Å for Cα atoms. In addition to the crystal soaking
experiments, we also tried to cocrystallize SARS-CoV Mpro with
the aldehydes. The crystals obtained were predominantly of space
group P2₁, with the exception of the complex with Ac-NSFSQ-H,
which still displayed space group C2. However, in most of the
P2₁ crystals, no electron density for the aldehyde could be
observed; thus, while the presence of the inhibitors induced a
change of space group in most cases, the compounds themselves
were not detected in the crystals. An exception was Ac-ESTLQ-H.
The crystals of its complex with the Mpro diffracted to 1.89 Å, but
in both enzyme monomers in the asymmetric unit, electron density
could only be seen for residues P1 (Gln) and P2 (Leu) of the
inhibitor (Fig. 1E and F). Crystallographic data and refinement
statistics for all six crystal structures are summarized in Table 1.


3.3. Dual configurations of the thiohemiacetal in the complex with Ac- 
NSFSQ-H 


In the SARS-CoV Mpro complex with aldehyde inhibitor
Ac-NSFSQ-H, the thiohemiacetal adopts alternative configurations 
(Fig. 2A and B). This is likely the result of nucleophilic attack by 
Cys145 onto the planar carbonyl from either side. In the (R)-isomer, 
Fig. 1. Inhibitor binding at the active site of SARS-CoV Mpro. 2Fo − Fc electron-density maps are shown for inhibitors (A) Ac-ESTLQ-H, (B) Ac-NSTSQ-H, (C) Ac-DSFDQ-H, (D)
Ac-NSFSQ-H, (E) the visible portion of Ac-ESTLQ-H cocrystallized with the Mpro, molecule A, and (F) molecule B, (G) Cm-FF-H. The maps are contoured at a level of 1σ. The
catalytic Cys145 of SARS-CoV Mpro forms a thiohemiacetal with the aldehyde group of the inhibitors.


the thiohemiacetal oxygen forms a hydrogen bond with the imidazole
of His41 (3.30 Å) of the catalytic dyad (Fig. 2C), whereas in the
(S)-isomer, the oxygen atom points away from His41 and into the
oxyanion hole, forming H-bonds with the main-chain amides of
Fig. 2. Dual configurations of the thiohemiacetal in the complex with Ac-NSFSQ-H. (A) and (B): Ac-NSFSQ-H binding at the active site in two alternative configurations.
Hydrogen bonding interactions are represented by broken lines. Distances between atoms are shown in Å. (C) and (D): Schematic diagram of the aldehyde inhibitor
Ac-NSFSQ-H attacked by the catalytic cysteine 145, leading to the formation of two possible diastereomeric products.


Gly143 (2.71 Å) and Cys145 (3.15 Å) (Fig. 2D). Previous X-ray and
NMR studies of aldehyde binding to various cysteine and serine 
proteases demonstrated that one or both configurations were pres- 
ent in the structures of the complexes (Delbaere and Brayer, 1985; 
Webber et al., 1998; Robin et al., 2009). 


3.4. The S2 subsite 


The S2 subsite is a large pocket lined by residues His41, Met49,
Cys145, His164, Met165, Asp187, Arg188mc (mc: contribution
from main-chain atoms only), and Gln189 (Fig. 1). At the cleavage
sites of the SARS-CoV Mpro in the viral polyproteins, the P2 position
is mostly found to be occupied by Leu or Phe. Accordingly, almost
all inhibitors designed for the enzyme carry a large hydrophobic
group in the P2 position. We were therefore very surprised to find
the Ser or Asp side chains in the P2 position of the aldehydes bound
to this hydrophobic site (Fig. 1). In the SARS-CoV Mpro complexes
with Ac-NSTSQ-H or Ac-NSFSQ-H, there is no hydrogen-bonding
interaction between the side chain of P2-Ser and residues of the
S2 subsite. As the Ser side chain is small, there is no steric barrier
for Ser binding in the large S2 pocket. The serine does not quite
reach the “bottom” of the S2 pocket, but the distance between
its Oγ atom and the carbonyl oxygen of residue Gln192 is 4.0 Å,
leaving no space for a water molecule in between. Accordingly,
we did not find any electron density suggesting the presence
of a well-ordered water in the S2 pocket, although disordered
water is probably present in the pocket on both sides of the serine
side-chain, where free spaces with a total volume of >70 Å³ are
present between the substrate and the protein. It should be noted
that the side-chain of Gln189, which contributes to one wall of the
S2 pocket, is quite flexible and adopts different conformations in
the various inhibitor complexes, allowing or preventing the access
of water to the pocket.

In the case of the SARS-CoV Mpro complex with Ac-DSFDQ-H, the P2
Asp side-chain is situated between the side chains of Met49 and
Met165, which form opposite walls of the S2 pocket. Most interestingly,
its carboxylate oxygens undergo close interactions with the
sulfur atoms of the two methionines. The distances ($r_{S\cdots O}$) between
the Sδ atoms of Met49 and Met165 on the one hand and the Oδ2
and Oδ1 atoms, respectively, of P2-Asp on the other are 3.36 and
3.45 Å (Fig. 3). Then the relative distances ($d_{S\cdots O}$) can be calculated
according to $d_{S\cdots O} = r_{S\cdots O} - \mathrm{vdw}(S) - \mathrm{vdw}(O)$, where values of 1.80
and 1.52 Å are used as van der Waals radii (vdw) for S and O atoms,
respectively. With $d_{S\cdots O}$ = 0.04 and 0.13 Å, the distances between
the P2-Asp and the two S2-Met side-chains fulfill the condition
of $d_{S\cdots O}$ < 0.2 Å (Iwaoka et al., 2002) for a non-bonded S...O
contact. These interactions are not in the plane of the methionine
sulfides; the θ values (θ is the polar angle between the normal to
the sulfide plane and the S...O vector (Pal and Chakrabarti,
2001)) are 30.6° and 52.8°. Similar nonbonded interactions
Fig. 3. Interactions of P2-Asp in the S2 subsite in the complex structure SARS-CoV
Mpro: Ac-DSFDQ-H. P2-Asp carboxylate oxygens interact with the sulfur atoms of
Met49 and Met165. These non-bonded interactions are represented by dashed
lines. Distances between atoms are shown in Å.


between the methionine sulfur atoms and main-chain carbonyl 
oxygens or carboxylate side-chains have been detected previously 
in the hydrophobic cores of proteins and were proposed to stabi- 
lize the protein fold (Pal and Chakrabarti, 2001). It has also been 
suggested that S...O interactions should be taken into account in 
protein engineering studies (Iwaoka et al., 2002; Pal and Chakrab- 
arti, 2001), but to the best of our knowledge, we provide here the 
first description of a methionine-carboxylate interaction in a pro- 
tein-ligand complex. The unexpected finding of Ser and Asp bind- 
ing in the S2 subsite constitutes a deviation from the dogma that 
peptide inhibitors of proteases should contain amino-acid residues 
corresponding to the sequence specificity of the target enzyme. 
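
The distance criterion applied to the P2-Asp/methionine contacts in this section can be checked numerically. The short sketch below subtracts the two van der Waals radii quoted in the text from each measured S...O distance and compares the result with the 0.2 Å cutoff; the function and constant names are illustrative choices, not part of any crystallographic software.

```python
VDW_S = 1.80  # van der Waals radius of sulfur in Angstrom (value used in the text)
VDW_O = 1.52  # van der Waals radius of oxygen in Angstrom (value used in the text)

def relative_distance(r_so):
    """Relative S...O distance: measured distance minus both vdW radii (Angstrom)."""
    return r_so - VDW_S - VDW_O

def is_close_contact(r_so, cutoff=0.2):
    """True if the relative distance is below the cutoff for a non-bonded contact."""
    return relative_distance(r_so) < cutoff

# Measured distances reported for the Ac-DSFDQ-H complex
contacts = [("Met49 S-delta / Asp O-delta2", 3.36),
            ("Met165 S-delta / Asp O-delta1", 3.45)]
for label, r in contacts:
    print(label, round(relative_distance(r), 2), is_close_contact(r))
```

Both contacts give relative distances of 0.04 and 0.13 Å and therefore fall below the cutoff, in line with the values stated above.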


3.5. Analysis of peptide substrates harboring different amino-acid 
residues in P2 


The unexpected observation of the P2-Asp residue of aldehyde
Ac-DSFDQ-H binding to the hydrophobic S2 pocket prompted us
to re-determine the cleavage specificity of the SARS-CoV Mpro with
respect to the P2 position. Within the context of the peptide substrate
SWTSAVXQ↓SGFRKWA, all 20 proteinogenic amino acids
were tested in position P2 (=X). In the HPLC-based assay, the protease
concentration was kept constant at 0.5 µM. The canonical
substrate with Leu in the P2 position was completely cleaved within
two minutes and was therefore used as the standard to compare
the proteolytic activities with other substrates. The results are
listed in Fig. 4. Our study confirmed that the most favored residues
in the P2 position of the substrate (after Leu) are the hydrophobic
amino acids Phe, Met, Val, and Ile. Substrates with polar amino
acids in P2 are poorly cleaved. The cleavage efficiency for the substrate
with P2 = Ser was 160-times lower than for the best substrate
(P2 = Leu); furthermore, the peptide with P2 = Asp was not
cleaved at all after incubation with SARS-CoV Mpro for 16 h.


3.6. S1 subsite 


The S1 subsite is lined by residues Phe140, Asn142, Gly143, 
Ser144, Cys145, His163, Glu166, and His172. From the substrate 
Fig. 4. Relative cleavage efficiencies of 20 peptide substrates harboring different
amino-acid residues in P2.


specificity results (Fan et al., 2004, 2005; Lai et al., 2006), P1 has
to be Gln. Moreover, in all SARS-CoV Mpro complex structures described
so far (Lee et al., 2005; Xue et al., 2007; Yang et al., 2005,
2003; Yin et al., 2007), the S1 subsite is occupied by a polar group,
i.e. the side chain of Gln or a five-membered lactam (used as a glutamine
surrogate). Shie et al. synthesized a series of α,β-unsaturated
esters containing both P1 and P2 phenylalanine residues
which showed modest inhibitory activity against the SARS-CoV
Mpro (IC50 = 11-39 µM) (Shie et al., 2005). The α,β-unsaturated
ethyl ester of 4-(dimethylamino)cinnamoyl-Phe-Phe was reported
to be a potent inhibitor, with an inhibition constant of 0.52 µM
(Shie et al., 2005). The computer model for the complex of SARS-CoV
Mpro with the compound proposed that the P1 and P2 phenyl
groups occupy the S2 and S3 pockets, respectively (Shie et al.,
2005). However, in the crystal structure of the complex with the
aldehyde Cm-FF-H that we describe in this communication, the
side chain of P1-Phe is inserted into the S1 subsite (Fig. 1G). In order
to understand this unexpected deviation from the commonly
observed specificity, we compared the interaction with the S1 subsite
observed in our structures of the complexes with Cm-FF-H and
Ac-ESTLQ-H, the latter of which corresponds exactly to the
Fig. 5. Superimposition of the structures of SARS-CoV Mpro complexes with Cm-FF-H
(red) and Ac-ESTLQ-H (green) showing the conformational changes in the S1
subsite. (For interpretation of the references to color in this figure legend, the reader
is referred to the web version of this article.)
described cleavage specificity of the Mpro. The predominant S1
specificity of the enzyme for Gln is determined primarily by
His163. In the complex formed with Ac-ESTLQ-H, Nε2 of His163
donates a 2.83-Å hydrogen bond to the side-chain carbonyl oxygen
(Oε1) of P1-Gln (Fig. 1). In the complex with Cm-FF-H, the side
chain of Asn142 undergoes an 83.8° rotation about its χ1 torsion
angle compared to the conformation in the complex with Ac-ESTLQ-H
(Fig. 5), leading to an opening of the S1 subsite towards
the solvent. A similar movement of Asn142 has been observed in
the crystal structure of SARS-CoV Mpro in complex with an inhibitor
called “N1” (PDB ID: 1WOF), where both the regular conformation
of this residue and the one observed in our structure were
found (Yang et al., 2005). The P1-benzyl group of Cm-FF-H binds
to the S1 subsite by making hydrophobic interactions with
Phe140, Leu141, Asn142, and the P3 cinnamoyl group of Cm-FF-H.
As the S1 subsite specificity does not seem to be as stringent
as previously thought, it may well be possible that other chemical
groups can be accommodated here, thereby expanding the opportunities
for designing and synthesizing efficient inhibitors of the
SARS-CoV Mpro.


4. Conclusions 


In this study, we report six crystal structures of SARS-CoV Mpro
in complex with peptide aldehydes and the inhibition kinetics of
these compounds. The crystal structures reveal the mechanism of
aldehyde binding to the active site of SARS-CoV Mpro and provide
detailed information on the atomic interactions. Since Asp or Ser
were found to bind to the hydrophobic S2 subsite, and Phe was
located in the hydrophilic S1 subsite of SARS-CoV Mpro, we conclude
that the stringent substrate specificity of the SARS-CoV Mpro
with respect to the P1 and P2 positions can be overruled by the
highly electrophilic character of the aldehyde warhead. This constitutes
a deviation from the dogma that peptidic protease inhibitors
should comprise an amino-acid sequence corresponding to the
cleavage specificity of the target enzyme. The observed non-bonded
interaction of the carboxylate oxygen atoms of an aspartate
residue of an inhibitor with the thioether moieties of two
methionines forming part of a hydrophobic pocket of the target
protein is remarkable and suggests that such S...O interactions
should be added to the repertoire of computer-aided drug design.


4.1. Accession numbers 


Atomic coordinates and structure factors have been deposited 
in the RCSB Protein Data Bank under ID codes 3SN8, 3SNA, 3SNB, 
3SNC, 3SND, and 3SNE (see Table 1). 